This report presents my analysis of a dataset containing 4,335,311 records. I evaluated key fields, including age, postcode, and gender, to identify anomalies, missing values, and distribution patterns. The report surfaces invalid ages, uneven postcode distributions, and incomplete fields through styled tables and interactive charts whose color bars indicate value ranges.

| Metric | Value |
|---|---|
| Total Row Count | 4,335,311 |
| Line-wise Duplicates | 1,243,458 |
| Duplicates (post_code, dob, name) | 4,293,004 |
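
The two duplicate counts above could be reproduced with a sketch along the following lines. It assumes that a "duplicate" is every repeat of a row (or key) after its first occurrence; the report does not state which convention was used. The function name `count_duplicates` and the sample records are hypothetical and not taken from the dataset.

```python
from collections import Counter


def count_duplicates(rows, key_fields=None):
    """Count rows that repeat an earlier row.

    rows: list of dicts, one per record.
    key_fields: field names to compare; None compares the whole row.
    Assumption: the first occurrence is kept, every later repeat is a duplicate.
    """
    if key_fields is None:
        # Compare all fields; sort items so key order does not matter.
        keys = [tuple(sorted(row.items())) for row in rows]
    else:
        keys = [tuple(row.get(field) for field in key_fields) for row in rows]
    counts = Counter(keys)
    return sum(n - 1 for n in counts.values())


# Hypothetical sample records for illustration only.
sample = [
    {"post_code": "5000", "dob": "1980-01-01", "name": "A", "provider": "CSS"},
    {"post_code": "5000", "dob": "1980-01-01", "name": "A", "provider": "CSS"},
    {"post_code": "5000", "dob": "1980-01-01", "name": "A", "provider": "Swica"},
    {"post_code": "9050", "dob": "1975-06-30", "name": "B", "provider": "KPT"},
]

full_row_dupes = count_duplicates(sample)
key_dupes = count_duplicates(sample, ["post_code", "dob", "name"])
print(full_row_dupes, key_dupes)
```

Comparing on a subset of fields always finds at least as many duplicates as comparing whole rows, which is consistent with the key-based count in the table being the larger of the two.
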

| Provider | Count |
|---|---|
| Easy Sana | 207,270 |
| Philos | 207,270 |
| Avenir | 207,270 |
| Supra | 207,270 |
| Mutuel Assurance | 206,463 |
| Atupri | 186,081 |
| EGK-Gesundheitskasse | 182,058 |
| CSS | 180,886 |
| CSS - Arcosana | 180,886 |
| CSS - Intras | 180,885 |
| CSS - Sanagate | 180,884 |
| Swica | 178,588 |
| Progrès | 177,633 |
| Helsana | 177,633 |
| Provita | 176,109 |
| ÖKK | 176,107 |
| KVF | 176,107 |
| KPT | 174,848 |
| Compact | 172,324 |
| Sanitas | 172,324 |
| Visana | 150,614 |
| Sana24 | 150,614 |
| Vivacare | 150,614 |
| Innova - Sanvita | 87,287 |
| Innova - Activa | 87,286 |

| Post Code | Count |
|---|---|
| 5000 | 220,838 |
| 9050 | 200,406 |
| 8214 | 187,771 |
| 1473 | 187,009 |
| 8752 | 183,322 |
| 9308 | 182,527 |
| 4147 | 182,479 |
| 6375 | 182,259 |
| 2856 | 182,174 |
| 6110 | 181,875 |
| 9320 | 175,855 |
| 8836 | 175,582 |
| 6517 | 175,442 |
| 1860 | 175,265 |
| 1288 | 173,776 |
| 9100 | 172,542 |
| 6460 | 166,484 |
| 6300 | 162,598 |
| 4001 | 160,935 |
| 4622 | 156,043 |
| 7076 | 151,000 |
| 8914 | 129,034 |
| 3900 | 126,115 |
| 2013 | 125,748 |
| 3267 | 110,689 |
| 6053 | 107,543 |

| Field | Missing Count | Available Count | Available % | Missing % | Class |
|---|---|---|---|---|---|
| post_code | 0 | 4,335,311 | 100.0% | 0.0% | complete |
| dob | 0 | 4,335,311 | 100.0% | 0.0% | complete |
| gender | 0 | 4,335,311 | 100.0% | 0.0% | complete |
| Spitalzusatzversicherung | 1,794,214 | 2,541,097 | 58.61% | 41.39% | complete |
| Franchise | 2,286,487 | 2,048,824 | 47.26% | 52.74% | missing-high |
| Ambulante Zusatzversicherung | 3,813,610 | 521,701 | 12.03% | 87.97% | missing-high |
| Weitere Ergänzungen | 3,379,477 | 955,834 | 22.05% | 77.95% | missing-high |
| Zahnbehandlungen | 4,018,632 | 316,679 | 7.3% | 92.7% | missing-high |
| Unfallzusatz in den Zusatzversicherungen | 0 | 4,335,311 | 100.0% | 0.0% | complete |
| name | 0 | 4,335,311 | 100.0% | 0.0% | complete |
| Zusatzversicherung | 4,335,311 | 0 | 0.0% | 100.0% | missing-high |
| Product name | 4,335,311 | 0 | 0.0% | 100.0% | missing-high |
| Zusatzversicherung 1 | 31,053 | 4,304,258 | 99.28% | 0.72% | complete |
| Product name 1 | 31,053 | 4,304,258 | 99.28% | 0.72% | complete |
| Zusatzversicherung 2 | 3,834,005 | 501,306 | 11.56% | 88.44% | missing-high |
| Product name 2 | 3,834,005 | 501,306 | 11.56% | 88.44% | missing-high |
| age | 0 | 4,335,311 | 100.0% | 0.0% | complete |
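
A completeness table like the one above could be derived roughly as follows. This is a minimal sketch under stated assumptions: it treats `None` and blank strings as missing, and the 50% cut-off between `complete` and `missing-high` is only inferred from the table (41.39% missing is labelled `complete`, 52.74% is labelled `missing-high`), not a documented rule. The names `summarize_missing`, `is_missing`, and `MISSING_HIGH_THRESHOLD` are hypothetical, as are the sample records.

```python
# Assumed cut-off in percent; inferred from the table, not documented.
MISSING_HIGH_THRESHOLD = 50.0


def is_missing(value):
    """Treat None and blank strings as missing (an assumption)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def summarize_missing(rows, fields):
    """Return one summary dict per field with counts, percentages, and class."""
    total = len(rows)
    summary = []
    for field in fields:
        missing = sum(1 for row in rows if is_missing(row.get(field)))
        available = total - missing
        missing_pct = round(100.0 * missing / total, 2) if total else 0.0
        available_pct = round(100.0 * available / total, 2) if total else 0.0
        # How a value of exactly 50% should be classified is unknown;
        # this sketch labels it "complete".
        label = "missing-high" if missing_pct > MISSING_HIGH_THRESHOLD else "complete"
        summary.append({
            "field": field,
            "missing": missing,
            "available": available,
            "available_pct": available_pct,
            "missing_pct": missing_pct,
            "class": label,
        })
    return summary


# Hypothetical sample records for illustration only.
records = [
    {"gender": "f", "Franchise": "300", "Zahnbehandlungen": "yes"},
    {"gender": "m", "Franchise": None, "Zahnbehandlungen": "no"},
    {"gender": "f", "Franchise": "", "Zahnbehandlungen": None},
    {"gender": "m", "Franchise": None, "Zahnbehandlungen": "yes"},
]

summary = summarize_missing(records, ["gender", "Franchise", "Zahnbehandlungen"])
for entry in summary:
    print(entry)
```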